Search results for "Bot detection"

Showing 7 of 7 documents

Time series clustering with different distance measures to tell Web bots and humans apart

2022

The paper deals with the problem of differentiating Web sessions of bots and human users by observing some characteristics of their traffic at the Web server input. We propose an approach to cluster bots’ and humans’ sessions represented as time series. First, sessions are expressed as sequences of HTTP requests coming to the server at specific timestamps; then, they are preprocessed to form time series of limited length. Time series are clustered and the clustering performance is evaluated in terms of the ability to partition bots and humans into separate clusters. The proposed approach is applied to real server log data and validated with the use of different time series distance meas…

Web session; Time series; Unsupervised classification; Web bot detection; Internet robot; Similarity measure; Web bot; Clustering; Distance measure
ECMS 2022 Proceedings edited by Ibrahim A. Hameed, Agus Hasan, Saleh Abdel-Afou Alaliyat
researchProduct
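As a rough illustration of the pipeline the abstract above outlines (request timestamps turned into a fixed-length time series, then compared under different distance measures), here is a minimal sketch. The binning scheme, the parameter names `bin_seconds` and `length`, and the choice of dynamic time warping and Euclidean distance are assumptions made for illustration; the truncated abstract does not say which preprocessing or measures the authors used.

```python
from math import inf


def session_to_series(timestamps, bin_seconds=10, length=6):
    """Count requests per fixed-width time bin (hypothetical preprocessing).

    The series is limited to `length` bins; later requests are ignored and
    shorter sessions are implicitly zero-padded.
    """
    series = [0] * length
    if not timestamps:
        return series
    start = min(timestamps)
    for t in timestamps:
        idx = int((t - start) // bin_seconds)
        if idx < length:
            series[idx] += 1
    return series


def euclidean_distance(a, b):
    """Point-by-point distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch series b
                                    cost[i][j - 1],      # stretch series a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Two sessions with the same burst shifted by one bin: Euclidean distance
# penalises the shift, while DTW can warp it away.
s1 = session_to_series([0, 12], bin_seconds=10, length=4)
s2 = session_to_series([0, 25], bin_seconds=10, length=4)
print(s1, s2, euclidean_distance(s1, s2), dtw_distance(s1, s2))
```

A pairwise distance matrix built from either function could then be passed to any clustering method that accepts precomputed distances.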

Improving clustering of Web bot and human sessions by applying Principal Component Analysis

2019

The paper addresses the problem of modeling Web sessions of bots and legitimate users (humans) as feature vectors for their use at the input of classification models. So far many different features to discriminate bots’ and humans’ navigational patterns have been considered in session models but very few studies were devoted to feature selection and dimensionality reduction in the context of bot detection. We propose applying Principal Component Analysis (PCA) to develop improved session models based on predictor variables being efficient discriminants of Web bots. The proposed models are used in session clustering, whose performance is evaluated in terms of the purity …

Bot detection; Principal Component Analysis; PCA; Log analysis; Computer science; k-means; Internet robot; Classification; Web bot; Dimensionality reduction; Clustering; Web server; Feature selection; Data mining; Cluster analysis
Communications of the ECMS
researchProduct
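The abstract above mentions evaluating session clustering in terms of purity. The sketch below uses the standard textbook definition of cluster purity (the fraction of sessions that carry the majority label of their own cluster); whether the paper uses exactly this formulation is an assumption, since the abstract is truncated. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of items that share the majority true label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    label_counts = {}
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical clustering of six sessions into two clusters.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["bot", "bot", "human", "human", "human", "bot"]
print(cluster_purity(clusters, labels))
```

Purity reaches 1.0 when every cluster contains only bots or only humans, which makes it a convenient way to compare clusterings built on raw features against those built on principal components.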

Online Web Bot Detection Using a Sequential Classification Approach

2019

A significant problem nowadays is the detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…

Web server; HTTP request analysis; Internet security; Machine learning; Neural networks; Sequential classification; Web bot detection; Settore INF/01 - Informatica; Computer science; Networking & telecommunications; Engineering and technology; Session (web analytics); Task (computing); Web traffic; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Artificial intelligence
2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
researchProduct

A Quantum-Inspired Classifier for Early Web Bot Detection

2022

This paper introduces a novel approach, inspired by the principles of Quantum Computing, to address web bot detection in terms of real-time classification of an incoming data stream of HTTP request headers, in order to ensure the shortest decision time with the highest accuracy. The proposed approach exploits the analogy between the intrinsic correlation of two or more particles and the dependence of each HTTP request on the preceding ones. Starting from the a-posteriori probability of each request to belong to a particular class, it is possible to assign a Qubit state representing a combination of the aforementioned probabilities for all available observations of the time series. By levera…

Settore INF/01 - Informatica; Computer Networks and Communications; Safety, Risk, Reliability and Quality; bot detection; Correlation; Costs; Data models; early decision; multinomial classification; multivariate sequence classification; Predictive models; quantum-inspired computing; sequential classification; Task analysis; Time measurement; Time series analysis
researchProduct

Bot recognition in a Web store: An approach based on unsupervised learning

2020

Web traffic on e-business sites is increasingly dominated by artificial agents (Web bots) which pose a threat to the website security, privacy, and performance. To develop efficient bot detection methods and discover reliable e-customer behavioural patterns, the accurate separation of traffic generated by legitimate users and Web bots is necessary. This paper proposes a machine learning solution to the problem of bot and human session classification, with a specific application to e-commerce. The approach studied in this work explores the use of unsupervised learning (k-means and Graded Possibilistic c-Means), followed by supervised labelling of clusters, a generative learning stra…

Unsupervised classification; Web bot detection; Computer Networks and Communications; Computer science; Internet robot; Engineering and technology; Machine learning; Web traffic; Web server; Electrical engineering, electronic engineering, information engineering; Artificial neural network; Supervised learning; Networking & telecommunications; Perceptron; Web application security; Web bot; Computer Science Applications; Support vector machine; Generative model; ComputingMethodologies_PATTERNRECOGNITION; Hardware and Architecture; Supervised classification; Unsupervised learning; Artificial intelligence & image processing; Artificial intelligence
researchProduct

Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach

2020

Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which rely on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…

Web user; Information Systems and Management; Computer science; Internet robot; Feature selection; Engineering and technology; Machine learning; Unsupervised learning; Session (web analytics); Management Information Systems; Intelligent agent; Artificial Intelligence; Information systems; Electrical engineering, electronic engineering, information engineering; Cluster analysis; Bot detection; Information bottleneck method; Web bot; Server log; Hierarchical clustering; Artificial intelligence & image processing; Software
Knowledge-Based Systems
researchProduct

Efficient on-the-fly Web bot detection

2021

A large fraction of traffic on present-day Web servers is generated by bots: intelligent agents able to traverse the Web and execute various advanced tasks. Since bots’ activity may raise concerns about server security and performance, many studies have investigated traffic features discriminating bots from human visitors and developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, trying to classify active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as “bot” or “human”. A machine learning appro…

Web server; Information Systems and Management; Computer science; Internet robot; Engineering and technology; Machine learning; Usage data; Management Information Systems; Intelligent agent; Early decision; Neural network; Real-time bot detection; Sequential analysis; Web bot; Artificial Intelligence; Information systems; Electrical engineering, electronic engineering, information engineering; False positive paradox; Session (computer science); Traffic classification; Binary classification; Artificial intelligence & image processing; Classifier (UML); Software
Knowledge-Based Systems
researchProduct
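The entry above describes labelling an active session as "bot" or "human" as early as possible, and its keywords mention sequential analysis and early decision. The truncated abstract does not state the decision rule, so the sketch below substitutes a generic Wald-style sequential probability ratio test. It assumes a hypothetical per-request classifier that outputs a bot probability for each request, and it treats those outputs as independent evidence with equal priors; `alpha`, `beta`, and `eps` are illustrative parameters, not values from the paper.

```python
from math import log


def sequential_decision(bot_probs, alpha=0.01, beta=0.01, eps=1e-6):
    """Accumulate per-request log-odds until a decision threshold is crossed.

    bot_probs: per-request bot probabilities from some upstream classifier
               (hypothetical; not specified by the source).
    alpha:     tolerated rate of humans flagged as bots.
    beta:      tolerated rate of bots passed as humans.
    Returns (label, number_of_requests_used).
    """
    upper = log((1 - beta) / alpha)   # cross it: decide "bot"
    lower = log(beta / (1 - alpha))   # cross it: decide "human"
    log_odds = 0.0
    used = 0
    for used, p in enumerate(bot_probs, start=1):
        p = min(max(p, eps), 1 - eps)  # keep the logarithm finite
        log_odds += log(p / (1 - p))
        if log_odds >= upper:
            return ("bot", used)
        if log_odds <= lower:
            return ("human", used)
    return ("undecided", used)


# A session whose requests all look fairly bot-like is labelled
# before the remaining requests are seen.
print(sequential_decision([0.9, 0.9, 0.9, 0.9, 0.9]))
```

Tightening `alpha` and `beta` pushes the thresholds apart, trading later decisions for fewer misclassified sessions; sessions that end before either threshold is crossed stay undecided.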